Docker Swarm

Swarm 在 Docker 1.12 版本之前属于一个独立的项目,在 Docker 1.12 版本发布之后,该项目合并到了 Docker 中,成为 Docker 的一个子命令。目前,Swarm 是 Docker 社区提供的唯一一个原生支持 Docker 集群管理的工具。它可以把多个 Docker 主机组成的系统转换为单一的虚拟 Docker 主机,使得容器可以组成跨主机的子网网络。

初始化swarm

初始化后该节点自动成为管理节点,docker swarm init --listen-addr <MANAGER-IP>:<PORT>

root@JuGuang:/home# docker swarm init
Swarm initialized: current node (693jrftodx1nvglyqhlhevth8) is now a manager.

To add a worker to this swarm, run the following command:

docker swarm join --token SWMTKN-1-5wculuv0eajx1n2egcm5p5zeshd9u8mghocgt4c0tg7hl3xp48-brtf8sedk6q7xznu0s5bd2jie 192.168.1.56:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

如果不指定ip和port,则默认在所有ip上监听2377端口

root@JuGuang:/home# docker swarm init
Swarm initialized: current node (ou9zwv49wcn64gszp1s5ovflh) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-4u4vihapr0ruqvyhlmyzyarzj21gip6m9upo028wdh96rxkeqa-4wistx5wvu4yfb9x4nhpt8hcm 192.168.1.56:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

root@JuGuang:/home# netstat -anp|grep 2377
tcp6       0      0 :::2377                 :::*                    LISTEN      7491/dockerd 

如果指定ip和port,则在指定ip监听指定port

root@JuGuang:/home# docker swarm init --listen-addr 192.168.1.56:2378
Swarm initialized: current node (8z55rcs3gpo9zuvw1toodvn34) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-0gbf0poro8w71x505vfbwmvjyz9hxanynsvpv8ktn7ora4lu25-8d03yt705ycsmfof7b7l59s8c 192.168.1.56:2378

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

root@JuGuang:/home# netstat -anp|grep 2378
tcp        0      0 192.168.1.56:2378       0.0.0.0:*               LISTEN      7491/dockerd    
unix  3      [ ]         STREAM     CONNECTED     17297870 23784/3   

从上面的执行结果可以看到docker提示了如何加入该swarm,在其它docker节点直接执行上面的命令即可加入,后面如果忘记,可以通过以下命令查看:

root@fpi:/home# docker swarm join-token worker
To add a worker to this swarm, run the following command:

docker swarm join --token SWMTKN-1-48ut16wbobyicvdcst3zjlubcc3xoprmgxt8ge4c1hvfkq68ap-5c4c87klns2kbj5acbj1uqsch 192.168.1.48:2377

root@fpi:/home# docker swarm join-token manager 
To add a manager to this swarm, run the following command:

docker swarm join --token SWMTKN-1-48ut16wbobyicvdcst3zjlubcc3xoprmgxt8ge4c1hvfkq68ap-8qqyshayfovf94s45nbbe1uci 192.168.1.48:2377

加入swarm

执行命令docker swarm join --token xxx <MANAGER-IP>:<PORT>

root@JuGuang:/home# docker swarm join --token SWMTKN-1-48ut16wbobyicvdcst3zjlubcc3xoprmgxt8ge4c1hvfkq68ap-5c4c87klns2kbj5acbj1uqsch 192.168.1.48:2377
This node joined a swarm as a worker.

加入成功后可以在对应的管理节点查看到该节点:docker node ls,其中HOSTNAME:fpi-worker-1就是刚加入的节点

root@fpi:/home# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS
nibhlxst0t90nunmfpb00aso0 *   fpi-manager-1       Ready               Active              Leader
qsrqqyw7rekm4ftua7uzn9iph     fpi-worker-1        Ready               Active     

注意:一个swarm中可以存在多个管理者,节点可以以管理者的身份加入,像下面的fpi-worker-1这样,它的MANAGER列会被标识为Reachable,而创建此swarm的节点为Leader

fpi@fpi-manager-1:/home$ docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS
nfsf7b6wj9yqstz70jfbw5fsc     fpi-worker-1        Ready               Active              Reachable
nibhlxst0t90nunmfpb00aso0 *   fpi-manager-1       Ready               Active              Leader

离开/删除swarm

离开,使用docker swarm leave命令,如果是一个管理节点离开,则会给一个提示,需要使用docker swarm leave --force命令;一个swarm中的所有管理节点都离开后,整个swarm就被删除

root@JuGuang:/home# docker swarm leave
Error response from daemon: You are attempting to leave the swarm on a node that is participating as a manager. Removing the last manager erases all current state of the swarm. Use `--force` to ignore this message.
root@JuGuang:/home# docker swarm leave --force
Node left the swarm.

踩坑记:创建了一个swarm,其中有一个Leader管理节点,然后又加入了另一个普通管理节点,接着删除普通管理节点,整个swarm就挂了,执行任何命令都报错,提示活着的管理节点不足。最后重建整个swarm才得以解决(可能有更好的办法)。

节点升降级

  • 普通节点升级为管理节点:docker node promote
  • 管理节点降级为普通节点:docker node demote

管理端删除节点

docker swarm leave命令是节点离开swarm,但是此节点的信息还记录在管理端,管理端显示此节点的statusdown,要从管理端移除该节点,需要执行命令docker node rm [OPTIONS] NODE [NODE...]

root@fpi-manager-1:~# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS
169ut86ul6i7ymgc73ci5gf1z *   fpi-manager-1       Ready               Active              Leader
uujt74sedz62cv6oh87iq3zxf     fpi-worker-1        Down                Active 
root@fpi-manager-1:~# docker node rm fpi-worker-1
fpi-worker-1
root@fpi-manager-1:~# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS
169ut86ul6i7ymgc73ci5gf1z *   fpi-manager-1       Ready               Active              Leader

docker service

有关service的详细介绍,可以参见官方文档

在分布式应用程序中,应用程序的不同部分被称为“service”。例如,假如你有一个视频共享网站,它可能包括一个存储应用程序数据的service,一个在用户上传某个东西后在后台进行视频转码的service,一个前端service等,各个服务可以独立部署,共同协作完成业务。引入service的概念可以实现系统的隔离,对服务进行监控、扩展,从而保证系统的高可用性。

在大气网格项目的测试环境中,需要有前端服务(nginx),后端服务(tomcat),风场服务(windfield),zookeeper+kafka,数据库服务(mysql+mongo)等。

service用法如下

root@fpi-manager-1:/home/docker/baseimages/alpine-tomcat# docker service

Usage:	docker service COMMAND

Manage services

Options:
      --help   Print usage

Commands:
  create      Create a new service
  inspect     Display detailed information on one or more services
  logs        Fetch the logs of a service or task
  ls          List services
  ps          List the tasks of one or more services
  rm          Remove one or more services
  scale       Scale one or multiple replicated services
  update      Update a service

Run 'docker service COMMAND --help' for more information on a command.

docker stack

stack可以理解为一组service

用法如下:

root@fpi-manager-1:/home/docker/baseimages/alpine-tomcat# docker stack

Usage:	docker stack COMMAND

Manage Docker stacks

Options:
      --help   Print usage

Commands:
  deploy      Deploy a new stack or update an existing stack
  ls          List stacks
  ps          List the tasks in the stack
  rm          Remove one or more stacks
  services    List the services in the stack

例如大气网格的测试环境中,前端服务(nginx),后端服务(tomcat),风场服务(windfield),zookeeper+kafka,数据库服务(mysql+mongo)等各种服务可以组合为一个stack

实例

  • 准备两台物理机:192.168.1.48192.168.1.56

  • 192.168.1.48上执行docker swarm init,初始化swarm

  • 192.168.1.56上执行docker swarm join --token xxxxx 192.168.1.48:2377,加入swarm

  • 192.168.1.48上创建文件docker-compose.yml,写入以下内容

version: "3"

services:
  mongo:
    image: mongo:3.4.10
    restart: always
    environment:
      TZ: "Asia/Shanghai"
    ports:
      - "27017:27017"
    volumes:
      - "/home/docker/mongo/data:/data/db"
    deploy:
      replicas: 1
      placement:
        constraints: [node.role == manager]

  mysql:
    image: mysql:5.7.20
    restart: always
    environment:
      TZ: "Asia/Shanghai"
      MYSQL_ROOT_PASSWORD: fpi123456
    ports:
      - "3306:3306"
    volumes:
      - "/etc/localtime:/etc/localtime"
      - "/home/docker/mysql/data:/var/lib/mysql"
    deploy:
      replicas: 1
      placement:
        constraints: [node.role == manager]
        
  tomcat:
    image: alpine-tomcat:8.5.24
    depends_on:
      - "mysql"
      - "mongo"
    restart: always
    environment:
      TZ: "Asia/Shanghai"
    mac_address: 6C:92:BF:4E:0C:C6
    ports:
      - "8081:8080"
    volumes:
      - "/etc/localtime:/etc/localtime"
      - "/home/docker/tomcat/tomcat-agms/logs:/usr/local/tomcat/logs"
      - "/home/docker/tomcat/tomcat-agms/webapps:/usr/local/tomcat/webapps"
      - "/home/docker/tomcat/tomcat-agms/conf:/usr/local/tomcat/conf"
      - "/home/docker/tomcat/tomcat-agms/files:/usr/local/tomcat/files"
      - "/home/docker/tomcat/tomcat-agms/fonts:/usr/share/fonts"
    deploy:
      replicas: 1
      placement:
        constraints: [node.role == manager]

  windfield:
    image: windfield:1.0
    restart: always
    ports:
      - "10000:80"
    deploy:
      replicas: 1
      placement:
        constraints: [node.role == manager]
  • 部署stack,执行命令docker stack deploy -c docker-compose.yml agms,docker会尝试启动docker-compose.yml中定义的一系列服务。

  • 查看当前启动的service:docker service ls

root@fpi-manager-1:/home/docker/agms# docker service ls
ID                  NAME                     MODE                REPLICAS            IMAGE                             PORTS
7qnt4kfp8cwe        web_manager_visualizer   replicated          1/1                 dockersamples/visualizer:latest   *:9001->8080/tcp
kmku1xxfhwbz        agms_windfield           replicated          1/1                 windfield:1.0                     *:10000->80/tcp
llyk3og3xmh5        web_manager_portainer    replicated          1/1                 portainer/portainer:latest        *:9000->9000/tcp
lt0jre2e4exn        agms_tomcat              replicated          1/1                 alpine-tomcat:8.5.24              *:8081->8080/tcp
pjgg5x6q2muc        agms_mysql               replicated          1/1                 mysql:5.7.20                      *:3306->3306/tcp
qpdzwckwtf03        agms_mongo               replicated          1/1                 mongo:3.4.10                      *:27017->27017/tcp
  • 增加service规模
root@fpi-manager-1:/home/docker/agms# docker service scale agms_tomcat=2
agms_tomcat scaled to 2
root@fpi-manager-1:/home/docker/agms# docker service ls
ID                  NAME                     MODE                REPLICAS            IMAGE                             PORTS
7qnt4kfp8cwe        web_manager_visualizer   replicated          1/1                 dockersamples/visualizer:latest   *:9001->8080/tcp
kmku1xxfhwbz        agms_windfield           replicated          1/1                 windfield:1.0                     *:10000->80/tcp
llyk3og3xmh5        web_manager_portainer    replicated          1/1                 portainer/portainer:latest        *:9000->9000/tcp
lt0jre2e4exn        agms_tomcat              replicated          2/2                 alpine-tomcat:8.5.24              *:8081->8080/tcp
pjgg5x6q2muc        agms_mysql               replicated          1/1                 mysql:5.7.20                      *:3306->3306/tcp
qpdzwckwtf03        agms_mongo               replicated          1/1                 mongo:3.4.10                      *:27017->27017/tcp

可以看到agms_tomcatREPLICAS变为2/2

service也可以自动修复,比如删除service对应的一个正在运行的容器,service会尝试重新启动一个新的容器,以保持应用的理想状态(维持相应的规模)。

  • 减少service规模,执行docker service scale agms_tomcat=1

  • 检查负载均衡:访问http://192.168.1.48:8081/agms/web/api/v1/grids/select-tree,可以正常返回结果,将URL替换为http://192.168.1.56:8081/agms/web/api/v1/grids/select-tree,也可以正常返回结果

上次更新: 6/5/2020, 3:22:23 AM